Smart Paper Notes

Classification of COVID-19 in Chest X-ray Images Using Fusion of Deep Features and LightGBM

Hamid Nasiri , Ghazal Kheyroddin , Morteza Dorrigiv , Mona Esmaeili , Amir Raeisi Nafchi , Mohsen Haji Ghorbani , Payman Zarkesh-Ha

Category: Computer Vision

2022-06-09

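COVID-19 was first discovered in Wuhan, China, and spread rapidly around the world. Following the COVID-19 pandemic, many researchers have set out to find a way to diagnose COVID-19 from chest X-ray images. Early diagnosis of the disease can significantly affect the course of treatment. In this paper, we propose a new technique that is faster and more accurate than other methods reported in the literature. The proposed method uses a combination of the DenseNet169 and MobileNet deep neural networks to extract features from patients' X-ray images. Using a univariate feature selection algorithm, we narrow the features down to the most important ones. We then feed the selected features to the LightGBM (Light Gradient Boosting Machine) algorithm for classification. To evaluate the effectiveness of the proposed method, the ChestX-ray8 dataset, which includes 1125 chest X-ray images of patients, was used. The proposed method achieved accuracies of 98.54% and 91.11% on the two-class (COVID-19, Healthy) and multi-class (COVID-19, Healthy, Pneumonia) classification problems, respectively. We also used Gradient-weighted Class Activation Mapping (Grad-CAM) for further analysis.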
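
The fusion and univariate selection steps described above can be illustrated with a small standard-library sketch. It assumes the per-image feature vectors from the two backbones are already available as plain lists, and it assumes a one-way ANOVA F-score as the univariate criterion; the abstract does not say which statistic the authors use. The function names `fuse`, `anova_f_scores`, and `select_top_k` are hypothetical, and the LightGBM classifier that would consume the selected features is left out.

```python
from statistics import mean


def fuse(features_a, features_b):
    """Concatenate two per-image feature vectors (one list per backbone)."""
    return [a + b for a, b in zip(features_a, features_b)]


def anova_f_scores(X, y):
    """One-way ANOVA F-statistic of every feature column against the labels."""
    classes = sorted(set(y))
    n, k = len(X), len(classes)
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        grand = mean(column)
        between = within = 0.0
        for c in classes:
            group = [v for v, label in zip(column, y) if label == c]
            m = mean(group)
            between += len(group) * (m - grand) ** 2
            within += sum((v - m) ** 2 for v in group)
        if within == 0:
            # Perfectly separated (or constant) column: avoid dividing by zero.
            scores.append(float("inf") if between > 0 else 0.0)
        else:
            scores.append((between / (k - 1)) / (within / (n - k)))
    return scores


def select_top_k(X, y, k):
    """Return the indices of the k highest-scoring features and the reduced matrix."""
    scores = anova_f_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[row[j] for j in keep] for row in X]


# Toy data standing in for backbone outputs (values are made up for illustration).
feats_a = [[0.1, 5.0], [0.2, 5.1], [0.9, 5.0], [1.0, 4.9]]
feats_b = [[3.0], [3.1], [3.0], [2.9]]
labels = [0, 0, 1, 1]

fused = fuse(feats_a, feats_b)
kept, reduced = select_top_k(fused, labels, k=1)
print(kept, reduced)
```

In this toy example only the first fused feature separates the two classes clearly, so it is the one retained.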

Translated by Google Translate

Machine Learning Approach to Polymerization Reaction Engineering: Determining Monomers Reactivity Ratios

Tung Nguyen , Mona Bavarian

Category: Machine Learning

2023-01-03

Here, we demonstrate how machine learning enables the prediction of comonomer reactivity ratios based on the molecular structure of monomers. We combined multi-task learning, multiple inputs, and a Graph Attention Network to build a model capable of predicting reactivity ratios from the monomers' chemical structures.

ALERT: Adapting Language Models to Reasoning Tasks

Ping Yu , Tianlu Wang , Olga Golovneva , Badr Alkhamissy , Gargi Ghosh , Mona Diab , Asli Celikyilmaz

Category: Natural Language Processing

2022-12-16

Current large language models can perform reasonably well on complex tasks that require step-by-step reasoning with few-shot learning. Are these models applying reasoning skills they have learnt during pre-training and reasoning outside of their training context, or are they simply memorizing their training corpus at finer granularity and have learnt to better understand their context? To tease apart these possibilities, we introduce ALERT, a benchmark and suite of analyses for assessing language models' reasoning ability, comparing pre-trained and finetuned models on complex tasks that require reasoning skills to solve. ALERT provides a test bed to assess any language model on fine-grained reasoning skills; it spans over 20 datasets and covers 10 different reasoning skills. We leverage ALERT to further investigate the role of finetuning. With extensive empirical analysis we find that language models learn more reasoning skills, such as textual entailment, abductive reasoning, and analogical reasoning, during the finetuning stage than during the pretraining stage. We also find that when language models are finetuned they tend to overfit to the prompt template, which hurts the robustness of models, causing generalization problems.

CREPE: Can Vision-Language Foundation Models Reason Compositionally?

Zixian Ma , Jerry Hong , Mustafa Omer Gul , Mona Gandhi , Irena Gao , Ranjay Krishna

Category: Natural Language Processing | Computer Vision

2022-12-13

A fundamental characteristic common to both human vision and natural language is their compositional nature. Yet, despite the performance gains contributed by large vision and language pretraining, we find that - across 6 architectures trained with 4 algorithms on massive datasets - they exhibit little compositionality. To arrive at this conclusion, we introduce a new compositionality evaluation benchmark CREPE which measures two important aspects of compositionality identified by cognitive science literature: systematicity and productivity. To measure systematicity, CREPE consists of three test datasets. The three test sets are designed to test models trained on three of the popular training datasets: CC-12M, YFCC-15M, and LAION-400M. They contain 385K, 385K, and 373K image-text pairs and 237K, 210K, and 178K hard negative captions. To test productivity, CREPE contains 17K image-text pairs with nine different complexities plus 246K hard negative captions with atomic, swapping, and negation foils. The datasets are generated by repurposing the Visual Genome scene graphs and region descriptions and applying handcrafted templates and GPT-3. For systematicity, we find that model performance decreases consistently when novel compositions dominate the retrieval set, with Recall@1 dropping by up to 8%. For productivity, models' retrieval success decays as complexity increases, frequently nearing random chance at high complexity. These results hold regardless of model and training dataset size.
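
For readers unfamiliar with the Recall@1 metric quoted above, the following is a minimal sketch of how retrieval recall against hard negatives can be computed. It assumes each query comes with one similarity score for the correct caption and a list of scores for its hard negatives, and it counts ties against the model; the function name `recall_at_k`, the data layout, and the tie-handling rule are illustrative assumptions, not CREPE's actual evaluation code.

```python
def recall_at_k(queries, k=1):
    """Fraction of queries whose correct caption ranks within the top k.

    Each query is (positive_score, [negative_scores]). A negative that scores
    at least as high as the positive is ranked ahead of it (ties count against).
    """
    hits = 0
    for positive, negatives in queries:
        rank = 1 + sum(1 for score in negatives if score >= positive)
        if rank <= k:
            hits += 1
    return hits / len(queries)


# Made-up similarity scores for four image queries.
queries = [
    (0.9, [0.2, 0.5]),  # correct caption ranked first
    (0.4, [0.6, 0.1]),  # one hard negative outranks it
    (0.7, [0.7, 0.3]),  # tie with a hard negative, counted as a miss at k=1
    (0.8, [0.1]),       # correct caption ranked first
]

print(recall_at_k(queries, k=1))
print(recall_at_k(queries, k=2))
```

With one correct caption among N candidates, random guessing gives a Recall@1 of 1/N, which is the reference point for the "random chance" comparison in the abstract.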

Snapshot Multispectral Imaging Using a Diffractive Optical Network

Deniz Mengu , Anika Tabassum , Mona Jarrahi , Aydogan Ozcan

Category: Computer Vision

2022-12-10

Multispectral imaging has been used for numerous applications in, e.g., environmental monitoring, aerospace, defense, and biomedicine. Here, we present a diffractive optical network-based multispectral imaging system trained using deep learning to create a virtual spectral filter array at the output image field-of-view. This diffractive multispectral imager performs spatially-coherent imaging over a large spectrum, and at the same time, routes a pre-determined set of spectral channels onto an array of pixels at the output plane, converting a monochrome focal plane array or image sensor into a multispectral imaging device without any spectral filters or image recovery algorithms. Furthermore, the spectral responsivity of this diffractive multispectral imager is not sensitive to input polarization states. Through numerical simulations, we present different diffractive network designs that achieve snapshot multispectral imaging with 4, 9 and 16 unique spectral bands within the visible spectrum, based on passive spatially-structured diffractive surfaces, with a compact design that axially spans ~72 times the mean wavelength of the spectral band of interest. Moreover, we experimentally demonstrate a diffractive multispectral imager based on a 3D-printed diffractive network that creates at its output image plane a spatially-repeating virtual spectral filter array with 2x2=4 unique bands in the terahertz spectrum. Due to their compact form factor and computation-free, power-efficient and polarization-insensitive forward operation, diffractive multispectral imagers can be transformative for various imaging and sensing applications and be used at different parts of the electromagnetic spectrum where high-density and wide-area multispectral pixel arrays are not widely available.
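
To make the idea of a spatially-repeating filter array concrete, here is a small sketch of how pixels on a monochrome frame could be mapped to spectral bands when an n x n pattern tiles the sensor. The row-major ordering of bands inside each tile, and the names `band_index` and `split_bands`, are assumptions for illustration only; the abstract does not specify the layout.

```python
def band_index(row, col, n=2):
    """Band assigned to pixel (row, col) when an n x n pattern tiles the sensor."""
    return (row % n) * n + (col % n)


def split_bands(frame, n=2):
    """Read out n*n lower-resolution band images from one monochrome frame.

    `frame` is a list of equal-length rows; its height and width are assumed
    to be multiples of n.
    """
    bands = {}
    for b in range(n * n):
        r0, c0 = divmod(b, n)
        bands[b] = [row[c0::n] for row in frame[r0::n]]
    return bands


# A 4x4 toy frame whose pixel values are just their raster index.
frame = [[r * 4 + c for c in range(4)] for r in range(4)]
bands = split_bands(frame, n=2)
print(bands[0])
print(bands[3])
```

Each band image is a plain sub-sampling of the frame, which reflects the abstract's point that no image recovery algorithm is needed at readout.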

Unidirectional Imaging using Deep Learning-Designed Materials

Jingxi Li , Tianyi Gan , Yifan Zhao , Bijie Bai , Che-Yung Shen , Songyu Sun , Mona Jarrahi , Aydogan Ozcan

Category: Computer Vision

2022-12-05

A unidirectional imager would only permit image formation along one direction, from an input field-of-view (FOV) A to an output FOV B, and in the reverse path, the image formation would be blocked. Here, we report the first demonstration of unidirectional imagers, presenting polarization-insensitive and broadband unidirectional imaging based on successive diffractive layers that are linear and isotropic. These diffractive layers are optimized using deep learning and consist of hundreds of thousands of diffractive phase features, which collectively modulate the incoming fields and project an intensity image of the input onto an output FOV, while blocking the image formation in the reverse direction. After their deep learning-based training, the resulting diffractive layers are fabricated to form a unidirectional imager. As a reciprocal device, the diffractive unidirectional imager has asymmetric mode processing capabilities in the forward and backward directions, where the optical modes from B to A are selectively guided/scattered to miss the output FOV, whereas for the forward direction such modal losses are minimized, yielding an ideal imaging system between the input and output FOVs. Although trained using monochromatic illumination, the diffractive unidirectional imager maintains its functionality over a large spectral band and works under broadband illumination. We experimentally validated this unidirectional imager using terahertz radiation, very well matching our numerical results. Using the same deep learning-based design strategy, we also created a wavelength-selective unidirectional imager, where two unidirectional imaging operations, in reverse directions, are multiplexed through different illumination wavelengths. Diffractive unidirectional imaging using structured materials will have numerous applications in e.g., security, defense, telecommunications and privacy protection.

MONAI: An open-source framework for deep learning in healthcare

M. Jorge Cardoso , Wenqi Li , Richard Brown , Nic Ma , Eric Kerfoot , Yiheng Wang , Benjamin Murrey , Andriy Myronenko , Can Zhao , Dong Yang

Category: Machine Learning | Artificial Intelligence | Computer Vision

2022-11-04

Artificial Intelligence (AI) is having a tremendous impact across most areas of science. Applications of AI in healthcare have the potential to improve our ability to detect, diagnose, prognose, and intervene on human disease. For AI models to be used clinically, they need to be made safe, reproducible and robust, and the underlying software framework must be aware of the particularities (e.g. geometry, physiology, physics) of medical data being processed. This work introduces MONAI, a freely available, community-supported, and consortium-led PyTorch-based framework for deep learning in healthcare. MONAI extends PyTorch to support medical data, with a particular focus on imaging, and provide purpose-specific AI model architectures, transformations and utilities that streamline the development and deployment of medical AI models. MONAI follows best practices for software-development, providing an easy-to-use, robust, well-documented, and well-tested software framework. MONAI preserves the simple, additive, and compositional approach of its underlying PyTorch libraries. MONAI is being used by and receiving contributions from research, clinical and industrial teams from around the world, who are pursuing applications spanning nearly every aspect of healthcare.

NVIDIA FLARE: Federated Learning from Simulation to Real-World

Holger R. Roth , Yan Cheng , Yuhong Wen , Isaac Yang , Ziyue Xu , Yuan-Ting Hsieh , Kristopher Kersten , Ahmed Harouni , Can Zhao , Kevin Lu

Category: Machine Learning | Artificial Intelligence | Computer Vision

2022-10-24

Federated learning (FL) enables the building of robust and generalizable AI models by leveraging diverse datasets from multiple collaborators without centralizing the data. We created NVIDIA FLARE as an open-source software development kit (SDK) to make it easier for data scientists to use FL in their research and real-world applications. The SDK includes solutions for state-of-the-art FL algorithms and federated machine learning approaches, which facilitate building workflows for distributed learning across enterprises and enable platform developers to create a secure, privacy-preserving offering for multiparty collaboration utilizing homomorphic encryption or differential privacy. The SDK is a lightweight, flexible, and scalable Python package, and allows researchers to bring their data science workflows implemented in any training libraries (PyTorch, TensorFlow, XGBoost, or even NumPy) and apply them in real-world FL settings. This paper introduces the key design principles of FLARE and illustrates some use cases (e.g., COVID analysis) with customizable FL workflows that implement different privacy-preserving algorithms. Code is available at https://github.com/NVIDIA/NVFlare.
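
As a rough illustration of how a model can be built without centralizing data, the sketch below shows sample-weighted federated averaging, a widely used FL aggregation rule. This is not the NVIDIA FLARE API, and the abstract does not say which algorithms the SDK ships; the function name `federated_average` and the flat-list weight format are assumptions made to keep the example self-contained.

```python
def federated_average(client_updates):
    """Aggregate client model weights into one global weight vector.

    `client_updates` is a list of (weights, num_samples) pairs, where `weights`
    is a flat list of floats. Only weights and sample counts leave each client;
    the raw training data never does.
    """
    total = sum(num_samples for _, num_samples in client_updates)
    dim = len(client_updates[0][0])
    return [
        sum(weights[i] * num_samples for weights, num_samples in client_updates) / total
        for i in range(dim)
    ]


# Two hypothetical sites: the second holds three times as much data.
updates = [([1.0, 2.0], 1), ([3.0, 6.0], 3)]
print(federated_average(updates))
```

Weighting by sample count lets sites with more data pull the global model further, which is one common design choice among several.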

Using Unmanned Aerial Systems (UAS) for Assessing and Monitoring Fall Hazard Prevention Systems in High-rise Building Projects

Yimeng Li , Behzad Esmaeili , Masoud Gheisari , Jana Kosecka , Abbas Rashidi

Category: Robotics

2022-09-27

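This study develops a framework for unmanned aerial systems (UASs) to monitor fall hazard prevention systems near unprotected edges and openings in high-rise building projects. A three-step machine-learning-based framework was developed and tested to detect guardrail posts in images captured by a UAS. First, a guardrail detector was trained to localize candidate locations of the posts supporting the guardrail. Because the images used in this process were collected from an actual job site, several false detections were identified. Additional constraints were therefore introduced in the following steps to filter out false detections. Second, the research team applied a horizontal line detector to the images to correctly detect floors and remove detections that were not close to a floor. Finally, since guardrail posts are installed with a roughly regular distribution between posts, the spacing between them was estimated and used to find the most likely distance between two posts. The research team used various combinations of the developed approaches to monitor guardrail systems in images captured from a high-rise building project. Comparing precision and recall metrics showed that the cascade classifier performs better when combined with floor detection and guardrail spacing estimation. The results indicate that the proposed guardrail recognition system can improve the assessment of guardrails and ease the safety engineer's task of identifying fall hazards in high-rise building projects.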
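
The two filtering steps that follow the initial detector can be sketched as simple geometric checks. This is an illustrative heuristic only, not the authors' implementation: it assumes each candidate post is given as an (x, y) pixel coordinate of its base, that floor lines are given as y-coordinates, and that the typical spacing can be estimated by the median gap. The function names, the `max_gap` and `tolerance` parameters, and the sample coordinates are all hypothetical.

```python
from statistics import median


def filter_by_floor(posts, floor_ys, max_gap):
    """Keep candidates whose base lies within max_gap pixels of some floor line."""
    return [p for p in posts if any(abs(p[1] - fy) <= max_gap for fy in floor_ys)]


def filter_by_spacing(posts, tolerance=0.25):
    """Drop candidates that sit much closer to the previous kept post than is typical."""
    ordered = sorted(posts)
    if len(ordered) < 3:
        return ordered
    gaps = [b[0] - a[0] for a, b in zip(ordered, ordered[1:])]
    spacing = median(gaps)  # assumed estimate of the typical post spacing
    kept = [ordered[0]]
    for post in ordered[1:]:
        if post[0] - kept[-1][0] >= spacing * (1 - tolerance):
            kept.append(post)
    return kept


# Made-up detector output: one candidate far from the floor, one squeezed between posts.
candidates = [(100, 500), (200, 502), (230, 498), (300, 501), (400, 499), (250, 300)]

near_floor = filter_by_floor(candidates, floor_ys=[500], max_gap=10)
posts = filter_by_spacing(near_floor)
print(posts)
```

Here the floor check removes the candidate at (250, 300) and the spacing check removes the one at (230, 498), leaving four evenly spaced posts.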

A Distributed Acoustic Sensor System for Intelligent Transportation using Deep Learning

Chia-Yen Chiang , Mona Jaber , Peter Hayward

Category: Machine Learning

2022-09-13

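Intelligent Transportation Systems (ITS) are crucial to the development of sustainable and green urban living. ITS is data-driven and enabled by a profusion of sensors ranging from pneumatic tubes to smart cameras. This work explores a novel data source based on optical-fibre distributed acoustic sensors (DAS) for traffic analysis. Detecting the type of vehicle and estimating vehicle occupancy are prime concerns in ITS. The first is motivated by the need to track, control, and forecast traffic flow. The second targets the regulation of high-occupancy vehicle lanes in order to reduce emissions and congestion. These tasks are usually performed by manually inspecting vehicles or by using emerging computer vision technologies. The former is neither scalable nor efficient, while the latter intrudes on passengers' privacy. To this end, we propose a deep learning technique that analyses DAS signals to address this challenge through continuous sensing and without exposing personal information. We propose a deep learning method for processing DAS signals and achieve 92% vehicle classification accuracy and 92-97% occupancy detection accuracy based on DAS data collected under controlled conditions.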
